115 research outputs found

    Taking advantage of hybrid systems for sparse direct solvers via task-based runtimes

    The ongoing hardware evolution exhibits an escalation in the number, as well as in the heterogeneity, of computing resources. The pressure to maintain reasonable levels of performance and portability forces application developers to leave the traditional programming paradigms and explore alternative solutions. PaStiX is a parallel sparse direct solver, based on a dynamic scheduler for modern hierarchical manycore architectures. In this paper, we study the benefits and limits of replacing the highly specialized internal scheduler of the PaStiX solver with two generic runtime systems: PaRSEC and StarPU. The task graph of the factorization step is made available to the two runtimes, giving them the opportunity to process and optimize its traversal in order to maximize the algorithm's efficiency on the targeted hardware platform. A comparative study of the performance of the PaStiX solver on top of its native internal scheduler, PaRSEC, and StarPU, in different execution environments, is performed. The analysis highlights that these generic task-based runtimes achieve results comparable to the application-optimized embedded scheduler on homogeneous platforms. Furthermore, they are able to significantly speed up the solver on heterogeneous environments by taking advantage of the accelerators while hiding the complexity of their efficient manipulation from the programmer. Comment: Heterogeneity in Computing Workshop (2014)
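
As a rough illustration of what "making the task graph available to a runtime" means, the sketch below runs a small dependency graph in an order that respects its edges, using a ready queue. It is not the PaStiX, PaRSEC, or StarPU API: the function `run_task_graph`, the task names, and the sequential execution are all assumptions made for illustration. A real runtime would dispatch ready tasks to CPU cores or accelerators in parallel.

```python
from collections import deque


def run_task_graph(tasks, deps):
    """Execute tasks in an order that respects their dependencies.

    tasks: dict mapping task name -> zero-argument callable (hypothetical kernels)
    deps:  dict mapping task name -> list of task names it depends on
    Returns the list of task names in execution order.
    """
    # Count unfinished predecessors and record the successors of each task.
    remaining = {name: len(deps.get(name, [])) for name in tasks}
    successors = {name: [] for name in tasks}
    for name, preds in deps.items():
        for pred in preds:
            successors[pred].append(name)

    # Tasks with no pending predecessor are ready to run.
    ready = deque(sorted(n for n, count in remaining.items() if count == 0))
    order = []
    while ready:
        name = ready.popleft()
        tasks[name]()  # a real runtime would pick a worker here
        order.append(name)
        for succ in successors[name]:
            remaining[succ] -= 1
            if remaining[succ] == 0:
                ready.append(succ)

    if len(order) != len(tasks):
        raise ValueError("dependency cycle detected")
    return order


# Hypothetical four-task graph: one factorization feeds two updates,
# which both feed a final factorization.
log = []
tasks = {
    "factor_A": lambda: log.append("factor_A"),
    "update_B": lambda: log.append("update_B"),
    "update_C": lambda: log.append("update_C"),
    "factor_D": lambda: log.append("factor_D"),
}
deps = {
    "update_B": ["factor_A"],
    "update_C": ["factor_A"],
    "factor_D": ["update_B", "update_C"],
}
order = run_task_graph(tasks, deps)
```

In this toy graph the two updates become ready at the same time, which is the point where a runtime is free to choose an ordering or a device.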

    A NUMA Aware Scheduler for a Parallel Sparse Direct Solver

    Over the past few years, parallel sparse direct solvers have made significant progress and are now able to efficiently solve industrial three-dimensional problems with several million unknowns. To solve these problems efficiently, solvers such as PaStiX and WSMP provide a hybrid MPI-thread implementation well suited to SMP nodes and multi-core architectures. This drastically reduces the memory overhead of the factorization and improves the scalability of the algorithms. However, modern architectures introduce new hierarchical memory accesses that these solvers do not handle. In this paper we present three improvements to the PaStiX solver that raise performance on modern architectures: memory allocation, communication overlap, and dynamic scheduling. Results on numerical test cases are presented to demonstrate the efficiency of the approach on NUMA architectures

    A Comparative Study of Point Processes for Line Network Extraction in Remote Sensing

    We present in this report a comparative study between models of line network extraction, within a stochastic geometry framework. We rely on the theory of marked point processes specified by a density with respect to the uniform Poisson process. We aim to determine which prior density is the most relevant for road network detection. The "Candy" model, introduced in [21] for the extraction of road networks, is used as a reference model. This model is based on the idea that a road network can be thought of as a realization of a Markov object process, where the objects correspond to interacting line segments. We have developed two variants of this model which use quality coefficients for interactions. The first of these two variants is a generalization of the "Candy" model and the second one is an adaptation of the "IDQ" model proposed in [13] for the problem of building extraction from digital elevation models. The optimization is achieved by a simulated annealing with a RJMCMC algorithm. The experimental results, obtained for each model on aerial or satellite images, show the interest of adding quality coefficients for interactions in the prior density

    A Polyline Process for Unsupervised Line Network Extraction in Remote Sensing

    This report presents a new stochastic geometry model for unsupervised extraction of line networks (roads, rivers, etc.) from remotely sensed images. The line network in the observed scene is modeled by a polyline process, named CAROLINE. The prior model incorporates strong geometrical and topological constraints through potentials on the polyline shape and interaction potentials. Data properties are taken into account through a data term based on statistical tests. Optimization is done via a simulated annealing scheme using a Reversible Jump Markov Chain Monte Carlo (RJMCMC) algorithm, without any specific initialization. We accelerate the convergence of the algorithm by using appropriate proposal kernels. Experimental results are provided on aerial and satellite images and compared with the results obtained with a previous model, that is a segment process called "Quality Candy"

    Non regression testing for the JOREK code

    Non Regression Testing (NRT) aims to check whether software modifications result in undesired behaviour. Assuming the previous behaviour of the application is known, this kind of test makes it possible to identify a possible regression, that is, a bug. Improving and tuning a parallel code can be a time-consuming and difficult task, especially when people from different scientific fields interact closely. The JOREK code aims at investigating magnetohydrodynamic (MHD) instabilities in a Tokamak plasma. This paper describes the NRT procedure that has been tuned for this simulation code. Automation of the NRT is a key point in keeping the code healthy in a source code repository. Comment: No. RR-8134 (2012)
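
The idea of a non regression test can be sketched as a comparison between quantities from a stored reference run and the same quantities from the modified code. This is a minimal sketch and not the JOREK procedure: the function `find_regressions`, the quantity names, the tolerance, and all numeric values are made up for illustration.

```python
import math


def find_regressions(reference, current, rel_tol=1e-9):
    """Return the names of quantities that are missing or differ from the reference."""
    regressions = []
    for name, ref_value in reference.items():
        if name not in current:
            # A quantity that disappeared counts as a regression.
            regressions.append(name)
        elif not math.isclose(current[name], ref_value, rel_tol=rel_tol):
            regressions.append(name)
    return regressions


# Illustrative values only; these are not results from any real simulation.
reference = {"kinetic_energy": 1.25e-3, "magnetic_energy": 4.0e-2}
unchanged = {"kinetic_energy": 1.25e-3, "magnetic_energy": 4.0e-2}
modified = {"kinetic_energy": 1.25e-3, "magnetic_energy": 4.1e-2}

clean_report = find_regressions(reference, unchanged)
failing_report = find_regressions(reference, modified)
```

Running such a check automatically on every change to the repository is one way to obtain the automation the abstract refers to. The tolerance would need to be chosen per quantity, since parallel runs are rarely bitwise reproducible.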

    Seasonal contrast: Unsupervised pre-training from uncurated remote sensing data

    Remote sensing and automatic earth monitoring are key to solving global-scale challenges such as disaster prevention, land use monitoring, or tackling climate change. Although there exist vast amounts of remote sensing data, most of it remains unlabeled and thus inaccessible for supervised learning algorithms. Transfer learning approaches can reduce the data requirements of deep learning algorithms. However, most of these methods are pre-trained on ImageNet and their generalization to remote sensing imagery is not guaranteed due to the domain gap. In this work, we propose Seasonal Contrast (SeCo), an effective pipeline to leverage unlabeled data for in-domain pre-training of remote sensing representations. The SeCo pipeline is composed of two parts. First, a principled procedure to gather large-scale, unlabeled and uncurated remote sensing datasets containing images from multiple Earth locations at different timestamps. Second, a self-supervised algorithm that takes advantage of time and position invariance to learn transferable representations for remote sensing applications. We empirically show that models trained with SeCo achieve better performance than their ImageNet pre-trained counterparts and state-of-the-art self-supervised learning methods on multiple downstream tasks. The datasets and models in SeCo will be made public to facilitate transfer learning and enable rapid progress in remote sensing applications. Peer reviewed. Sustainable Development Goals: 15 - Life on Land (targets 15.2, 15.3, 15.9). Postprint (author's final draft)

    Hydrographic Network Extraction from Radar Satellite Images using a Hierarchical Model within a Stochastic Geometry Framework

    This report presents a two-step algorithm for unsupervised extraction of hydrographic networks from satellite images, that exploits the tree structures of such networks. First, the thick branches of the network are detected by an efficient algorithm based on a Markov random field. Second, the line branches are extracted using a recursive algorithm based on a hierarchical model of the hydrographic network, in which the tributaries of a given river are modeled by an object process (or a marked point process) defined within the neighborhood of this river. Optimization of each point process is done via simulated annealing using a reversible jump Markov chain Monte Carlo algorithm. We obtain encouraging results in terms of omissions and overdetections on a radar satellite image

    Extraction de réseaux linéiques à partir d'images satellitaires par processus Markov objet

    This article presents a method for the unsupervised extraction of line networks, such as road networks or hydrographic networks, from satellite images. We model the line network present in the image by a Markov object process. The density of this process was constructed to make the best use of the topology of the sought network and of the radiometric properties of the data. A reversible jump Markov chain Monte Carlo algorithm is proposed for the optimization. The experimental results obtained on different images show the model's ability to produce a continuous network, with low rates of omissions and false alarms

    Sparse direct solvers with accelerators over DAG runtimes

    The current trend in high performance computing shows a dramatic increase in the number of cores on shared memory compute nodes. Algorithms, especially those related to linear algebra, need to be adapted to these new computer architectures in order to be efficient. PASTIX is a sparse parallel direct solver that incorporates a dynamic scheduler for strongly hierarchical modern architectures. In this paper, we study the replacement of this internal, highly integrated scheduling strategy by two generic runtime frameworks: DAGUE and STARPU. These runtimes make it possible to execute the factorization task graph on emerging computers equipped with accelerators. As in previous work on dense linear algebra, we present the kernels used for GPU computations, inspired by the MAGMA library, and the DAG algorithm used with the two runtimes. A comparative study of the performance of the supernodal solver with the three different schedulers is performed on manycore architectures, and the improvements obtained with accelerators are presented with the STARPU runtime. These results demonstrate that these DAG runtimes provide uniform programming interfaces to obtain high performance on different architectures for irregular problems such as sparse direct factorizations